 Gallatin County


A Tyrannosaurus tooth embedded in dinosaur skull tells a violent story

Popular Science

First discovered 20 years ago, the rare fossil combo reveals a Cretaceous meal in the making. A rare dinosaur fossil on display at the Museum of the Rockies in Bozeman, Montana, tells a gory story. The skull from a large plant-eating dinosaur has a tooth lodged in it, indicating that the animal may have met its final moments as a meal. The tooth in question belongs to one of the most famous dinosaurs on earth: Tyrannosaurus.



Detecting Backdoor Attacks via Similarity in Semantic Communication Systems

arXiv.org Artificial Intelligence

Semantic communication systems, which leverage Generative AI (GAI) to transmit semantic meaning rather than raw data, are poised to revolutionize modern communications. However, they are vulnerable to backdoor attacks, a type of poisoning manipulation that embeds malicious triggers into training datasets. As a result, backdoor attacks mislead inference on poisoned samples while clean samples remain unaffected. Existing defenses may alter the model structure (such as neuron pruning, which potentially degrades inference performance on clean inputs) or impose strict requirements on data formats (such as "Semantic Shield", which requires image-text pairs). To address these limitations, this work proposes a defense mechanism that leverages semantic similarity to detect backdoor attacks without modifying the model structure or imposing data format constraints. By analyzing deviations in semantic feature space and establishing a threshold-based detection framework, the proposed approach effectively identifies poisoned samples. The experimental results demonstrate high detection accuracy and recall across varying poisoning ratios, underlining the effectiveness of the proposed solution.
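The threshold-based idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the reference set, or the threshold, so everything below is an assumption made for illustration: cosine similarity against the centroid of known-clean feature vectors, a hypothetical threshold of 0.8, and the hypothetical helper names `cosine_similarity`, `centroid`, and `flag_suspicious`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def flag_suspicious(samples, clean_reference, threshold=0.8):
    """Flag samples whose similarity to the clean centroid falls below the threshold.

    The threshold value is an assumption; in practice it would be tuned on held-out data.
    """
    ref = centroid(clean_reference)
    return [cosine_similarity(s, ref) < threshold for s in samples]


# Toy semantic feature vectors (made up for illustration).
clean = [[1.0, 0.1], [0.9, 0.2], [1.1, 0.0]]
samples = [[1.0, 0.15], [-0.2, 1.0]]
flags = flag_suspicious(samples, clean)
print(flags)
```

A sample that points in roughly the same direction as the clean centroid is kept, while one that deviates strongly in feature space is flagged, without touching the model itself.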


Trump names several new White House picks to work on AI, crypto and more: 'America First Patriots'

FOX News

A panel joins 'Fox News @ Night' to weigh in on a voter sentiment poll about the incoming Trump administration, Chinese President Xi Jinping's invitation to the presidential inauguration, and efforts by Trump Cabinet nominees to court senators. President-elect Donald Trump unleashed a slew of nominations on Sunday night, naming several new people to serve in his forthcoming administration. In several Truth Social posts on Sunday, Trump introduced various experts to work in the White House on issues ranging from defense to technology to budgeting. The Republican leader began by naming Stephen Alexander Vaden as his nominee for deputy secretary of the Department of Agriculture. "In my First Term, Stephen was the General Counsel of the Department of Agriculture, and a Member of the Board of the Commodity Credit Corporation, where he won two cases before the United States Supreme Court, relocated and reorganized the Agencies that comprise the Department to better serve Rural America, and engaged in substantial regulatory reform," Trump wrote in a post.


Adaptive Sampling to Reduce Epistemic Uncertainty Using Prediction Interval-Generation Neural Networks

arXiv.org Machine Learning

Obtaining high certainty in predictive models is crucial for making informed and trustworthy decisions in many scientific and engineering domains. However, extensive experimentation required for model accuracy can be both costly and time-consuming. This paper presents an adaptive sampling approach designed to reduce epistemic uncertainty in predictive models. Our primary contribution is the development of a metric that estimates potential epistemic uncertainty leveraging prediction interval-generation neural networks. This estimation relies on the distance between the predicted upper and lower bounds and the observed data at the tested positions and their neighboring points. Our second contribution is the proposal of a batch sampling strategy based on Gaussian processes (GPs). A GP is used as a surrogate model of the networks trained at each iteration of the adaptive sampling process. Using this GP, we design an acquisition function that selects a combination of sampling locations to maximize the reduction of epistemic uncertainty across the domain. We test our approach on three unidimensional synthetic problems and a multi-dimensional dataset based on an agricultural field for selecting experimental fertilizer rates. The results demonstrate that our method consistently converges faster to minimum epistemic uncertainty levels compared to Normalizing Flows Ensembles, MC-Dropout, and simple GPs.
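As a rough illustration of using prediction intervals to guide sampling, the sketch below ranks candidate locations by interval width and picks the widest ones as the next batch. This is a deliberate simplification and not the paper's metric or acquisition function: it ignores the observed data, neighboring points, and the Gaussian process surrogate described above. The helper names `interval_widths` and `select_batch` and the toy numbers are hypothetical.

```python
def interval_widths(lower, upper):
    """Width of each prediction interval (upper bound minus lower bound)."""
    return [u - l for l, u in zip(lower, upper)]


def select_batch(candidates, lower, upper, batch_size):
    """Return the batch_size candidates with the widest prediction intervals.

    Interval width is used here as a crude stand-in for epistemic uncertainty.
    """
    widths = interval_widths(lower, upper)
    ranked = sorted(range(len(candidates)), key=lambda i: widths[i], reverse=True)
    return [candidates[i] for i in ranked[:batch_size]]


# Toy one-dimensional example: four candidate locations with predicted bounds.
candidates = [0.0, 0.5, 1.0, 1.5]
lower = [0.9, 0.2, 1.0, 0.1]
upper = [1.1, 1.4, 1.3, 2.0]
batch = select_batch(candidates, lower, upper, batch_size=2)
print(batch)
```

Selecting the top-k widths independently tends to cluster samples in one uncertain region, which is one motivation for a joint batch acquisition function such as the one the abstract describes.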


Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning

arXiv.org Artificial Intelligence

A vast amount of scholarly work is published daily, yet much of it remains inaccessible to the general public due to dense jargon and complex language. To address this challenge in science communication, we introduce a reinforcement learning framework that fine-tunes a language model to rewrite scholarly abstracts into more comprehensible versions. Guided by a carefully balanced combination of word- and sentence-level accessibility rewards, our language model effectively substitutes technical terms with more accessible alternatives, a task that models trained with supervised fine-tuning or guided by conventional readability measures struggle to accomplish. Our best model adjusts the readability level of scholarly abstracts by approximately six U.S. grade levels, in other words, from a postgraduate to a high school level. This translates to roughly a 90% relative boost over the supervised fine-tuning baseline, all while maintaining factual accuracy and high-quality language. An in-depth analysis of our approach shows that balanced rewards lead to systematic modifications in the base model, likely contributing to smoother optimization and superior performance. We envision this work as a step toward bridging the gap between scholarly research and the general public, particularly younger readers and those without a college degree.
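To make "U.S. grade levels" concrete, the sketch below computes the Flesch-Kincaid grade level, one widely used conventional readability measure. The abstract does not say which measure the authors use, so choosing this formula is an assumption; the function name `flesch_kincaid_grade` and the example counts are hypothetical, and the word, sentence, and syllable counts are taken as given rather than extracted from text.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for an original abstract and a simplified rewrite.
original_grade = flesch_kincaid_grade(words=100, sentences=4, syllables=190)
rewritten_grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(original_grade, 2), round(rewritten_grade, 2))
```

Shorter sentences and fewer syllables per word both lower the score, which is why such measures reward surface-level simplification but say nothing about whether jargon was actually replaced.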


Time-Series Forecasting, Knowledge Distillation, and Refinement within a Multimodal PDE Foundation Model

arXiv.org Artificial Intelligence

Symbolic encoding has been used in multi-operator learning as a way to embed additional information for distinct time-series data. For spatiotemporal systems described by time-dependent partial differential equations, the equation itself provides an additional modality to identify the system. The utilization of symbolic expressions alongside time-series samples allows for the development of multimodal predictive neural networks. A key challenge with current approaches is that the symbolic information, i.e., the equations, must be manually preprocessed (simplified, rearranged, etc.) to match and relate to the existing token library, which increases costs and reduces flexibility, especially when dealing with new differential equations. We propose a new token library based on SymPy to encode differential equations as an additional modality for time-series models. The proposed approach incurs minimal cost, is automated, and maintains high prediction accuracy for forecasting tasks. Additionally, we include a Bayesian filtering module that connects the different modalities to refine the learned equation. This improves the accuracy of the learned symbolic representation and the predicted time-series.


Revisiting the Exit from Nuclear Energy in Germany with NLP

arXiv.org Artificial Intelligence

Annotation of political discourse is resource-intensive, but recent developments in NLP promise to automate complex annotation tasks. Fine-tuned transformer-based models outperform human annotators in some annotation tasks, but they require large manually annotated training datasets. In our contribution, we explore the degree to which a manually annotated dataset can be automatically replicated with today's NLP methods, using unsupervised machine learning and zero- and few-shot learning.


Simplifying Scholarly Abstracts for Accessible Digital Libraries

arXiv.org Artificial Intelligence

Making science more accessible remains a challenge even with much effort devoted to it on the producer and publisher side. As content producers, researchers are encouraged to engage directly with the public, either through social media (Davies, 2008; Hara et al., 2019; Knox and Hara, 2021) or by crafting more digestible manuscripts in research (Maurer et al., 2021) and practice (Grene et al., 2017). Funding agencies and renowned journals also encourage the communication of scientific findings in accessible language. For instance, the National Institutes of Health (NIH) advocates "clear and simple" principles when communicating with audiences with limited health literacy, and the Proceedings of the National Academy of Sciences of the United States of America (PNAS) requires authors to submit a significance statement accessible to non-experts (Berenbaum, 2021; Pool et al., 2021). As scientific research progresses with increased specialization and interdisciplinarity, it is acknowledged that the use of jargon effectively reduces communication costs among domain experts, particularly those responsible for reviewing submissions. This specialized language, however, can become incomprehensible to those without a similar research background. While efforts to share scientific findings in more accessible language from the producer side are gaining traction, widespread adoption is unlikely in the near future due to the inherent conflicts between the specialized nature of scholarly communication and the public-oriented dissemination of scientific findings. Within this effort to create understandable research findings and open science to broader communities, libraries, and our digital libraries in particular, have a role to play. Driven by this idea, we propose to start by improving the readability of abstracts from scholarly works through automated rewriting.


Counterfactual Analysis of Neural Networks Used to Create Fertilizer Management Zones

arXiv.org Artificial Intelligence

In Precision Agriculture, the utilization of management zones (MZs) that take into account within-field variability facilitates effective fertilizer management. This approach enables the optimization of nitrogen (N) rates to maximize crop yield production and enhance agronomic use efficiency. However, existing works often neglect the consideration of responsivity to fertilizer as a factor influencing MZ determination. In response to this gap, we present an MZ clustering method based on fertilizer responsivity. We build upon the statement that the responsivity of a given site to the fertilizer rate is described by the shape of its corresponding N fertilizer-yield response (N-response) curve. Thus, we generate N-response curves for all sites within the field using a convolutional neural network (CNN). The shape of the approximated N-response curves is then characterized using functional principal component analysis. Subsequently, a counterfactual explanation (CFE) method is applied to discern the impact of various variables on MZ membership. The genetic algorithm-based CFE solves a multi-objective optimization problem and aims to identify the minimum combination of features needed to alter a site's cluster assignment. Results from two yield prediction datasets indicate that the features with the greatest influence on MZ membership are associated with terrain characteristics that either facilitate or impede fertilizer runoff, such as terrain slope or topographic aspect.
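The goal of finding the minimum combination of features that alters a site's cluster assignment can be sketched with a brute-force search. Note the substitutions: the abstract describes a genetic algorithm solving a multi-objective problem, whereas this toy version enumerates feature subsets exhaustively and assigns clusters by nearest centroid. The function names `nearest_cluster` and `minimal_counterfactual`, the centroids, and the feature values are all hypothetical.

```python
from itertools import combinations


def nearest_cluster(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))


def minimal_counterfactual(point, centroids, target):
    """Smallest set of feature indices whose values, when replaced by the
    target centroid's values, move point into the target cluster.

    Returns (indices, modified_point), or None if no subset works.
    Exhaustive search: only practical for a handful of features.
    """
    n = len(point)
    for size in range(1, n + 1):
        for idxs in combinations(range(n), size):
            candidate = list(point)
            for i in idxs:
                candidate[i] = centroids[target][i]
            if nearest_cluster(candidate, centroids) == target:
                return idxs, candidate
    return None


# Toy example with three made-up site features, e.g. [slope, aspect, elevation].
centroids = [[0.0, 0.0, 0.0], [4.0, 1.0, 0.0]]
site = [0.5, 0.2, 0.1]
result = minimal_counterfactual(site, centroids, target=1)
print(result)
```

In this toy case changing only the first feature is enough to flip the assignment, which is the kind of evidence the abstract uses to argue that a feature is influential for MZ membership.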